The Workshops of the Tenth International AAAI Conference on Web and Social Media
Authors
Abstract
Semantic relatedness between words has been extracted from a variety of sources. In this ongoing work, we explore and compare several options for determining whether semantic relatedness can be extracted from navigation structures in Wikipedia. To that end, we first investigate the potential of representation learning techniques such as DeepWalk in comparison to previously applied methods based on counting co-occurrences. Since both methods are based on (random) paths in the network, we also study different approaches to generating paths from the Wikipedia link structure. For this task, we consider not only the link structure of Wikipedia but also the actual navigation behavior of users. Finally, we analyze whether semantics can also be extracted from smaller subsets of the Wikipedia link network. We find that representation learning techniques mostly outperform the investigated co-occurrence counting methods on the Wikipedia network. However, this is not the case for paths sampled from human navigation behavior.
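As a rough illustration of the co-occurrence counting idea mentioned in the abstract, the following minimal sketch samples random walks over a link graph, counts which pages appear near each other within a window, and compares pages by cosine similarity. The toy graph, the function names, and all parameter values are assumptions made for illustration; they are not the data, code, or settings used in the paper. A DeepWalk-style approach would feed the same walks to a skip-gram model instead of counting.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy link graph (page -> outgoing links); not data from the paper.
LINKS = {
    "Cat": ["Dog", "Pet"],
    "Dog": ["Cat", "Pet"],
    "Pet": ["Cat", "Dog"],
    "Car": ["Engine", "Road"],
    "Engine": ["Car", "Road"],
    "Road": ["Car", "Engine"],
}


def random_walks(links, walks_per_node=20, walk_length=8, seed=0):
    """Sample uniform random walks that start from every page."""
    rng = random.Random(seed)
    walks = []
    for start in links:
        for _ in range(walks_per_node):
            walk = [start]
            # Stop early if the current page has no outgoing links.
            while len(walk) < walk_length and links.get(walk[-1]):
                walk.append(rng.choice(links[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, window=2):
    """Count, for each page, the pages seen within `window` steps on a walk."""
    vectors = defaultdict(Counter)
    for walk in walks:
        for i, page in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[page][walk[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


walks = random_walks(LINKS)
vectors = cooccurrence_vectors(walks)
related = cosine(vectors["Cat"], vectors["Dog"])
unrelated = cosine(vectors["Cat"], vectors["Car"])
```

In this toy graph the two clusters are disconnected, so pages from different clusters never share a context and their similarity is zero, while pages in the same cluster score above zero. Replacing the sampled walks with recorded navigation paths would correspond to the human-navigation setting the abstract contrasts with link-based walks.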
Similar resources
The Workshops of the Tenth International AAAI Conference on Web and Social Media
Littar was the second-prize winning entry in an app competition. It implemented a system for visualizing places mentioned in individual literary works. Wikidata acted as the backend for the system. Here I describe the Littar system and also some of the issues I encountered while developing the system: How locations and literature can be related, what types of location-literature relations are p...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
We investigate the automatic generation of Wikipedia articles as an alternative to its manual creation. We propose a framework for creating a Wikipedia article for a named entity which not only looks similar to other Wikipedia articles in its category but also aggregates the diverse aspects related to that named entity from the Web. In particular, a semi-supervised method is used for determinin...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
Wikipedia follows an open editing model that allows anyone to edit any entry. This has led to questions about the credibility and quality of information on it. Yet, it remains one of the most widely visited online encyclopedias. In this paper, we present a discussion of the various factors that influence the trust that users have in Wikipedia through a framework consisting of personal, social a...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
Despite the tremendous amount of information on Wikipedia, only a very small amount is structured. Most of the information is embedded in unstructured text and extracting it is a nontrivial challenge. In this paper, we propose a full pipeline built on top of DeepDive to successfully extract meaningful relations from the Wikipedia text corpus. We evaluated the system by extracting company-found...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
Event detection in social media usually exploits information from social-networking platforms, such as Twitter or Facebook. However, previous research has suggested that Wikipedia constitutes a valuable source of information for the task of detecting breaking news. In this work we adapt a graph-based algorithm to the Wikipedia context, and compare it to the state-of-the-art Wikipedia real-time ...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
As a global, multilingual project, Wikipedia could serve as a repository for the world’s knowledge on an astounding range of topics. However, questions of participation and diversity among editors continue to be burning issues. We present the first targeted study of participants at Greek Wikipedia, with the goal of better understanding their motivations. Smaller Wikipedias play a key role in fo...